Method for marking a compressed digital video signal

ABSTRACT

A method for marking a compressed digital video signal by embedding a digital signature in the compressed video signal, the signal representing a series of at least two video images, each image being divided into a plurality of regions, the signal including movement vectors representing the movement of the regions between the first and the second image, characterised in that it consists in modifying at least one of the coefficients X or Y of at least one of the movement vectors.

TECHNICAL FIELD

This invention concerns the marking of video signals and in particular a method for marking a compressed digital video signal by incorporating a digital signature embedded in the video signal.

The options available for transmitting images via new media such as the Internet and satellite television mean that the volume of data exchanged, especially video images, will continue to increase.

In parallel, recent developments in the television and digital video sector not only afford an appreciable improvement in image quality but also enable copies of quality identical to the original to be made, such that the copy is indistinguishable from the original.

This is why research has been undertaken with a view to enabling absolute identification of the author of a video image, to enable him to enforce his rights with regard to unauthorised broadcasters.

Prior Art

Methods are known for the protection of video images consisting in incorporating a digital signature by slightly modifying the data in order to be able to identify the said signature. This signature may then be used to identify the author of the image. Two main signature techniques have been used to date. The simplest entails modulating the brightness of a pixel selected pseudo-randomly. This technique is used in particular for grey scale images and is described in the article by R. G. van Schyndel et al. (<<A Digital Watermark>>, Proceedings of the 1994 1st IEEE International Conference on Image Processing, Vol. 2, pp. 86-90, 1994).

More complex techniques have also been proposed in which the image is divided into blocks with amplitude modulation and block size that are in turn modulated by local energy. Techniques suitable for black and white images have been proposed by K. Matsui et al. (<<How to secretly imbed a signature on a picture,>> The Journal of the Interactive Multimedia Association Intellectual Property Project, Vol. 1, No. 1, pp. 187-206, January 1994). In the article by W. Bender et al. (<<Techniques for Data Hiding>>, Proceedings of the SPIE, 2420:40, February 1995), the difference between the brightness values of pixels is used. Whilst this latter article proposes duplication of textured areas and then use of an auto-correlation calculation, a technique very often used is modification of the DCT coefficients generated by JPEG encoders (S. Burgett et al., <<A Novel Method for Copyright Labelling Digitized Image Data>>, IEEE Transactions on Communications, September 1994) or MPEG-2. These two techniques are well suited to colour images or animated colour images respectively.

The number of bits generated by the images in the transmission of video information requires a high compression rate. In this field, the MPEG format has rapidly taken over and has become an international standard, with most professional equipment processing video signals in this format.

An identification technique specific to MPEG uses part of the compressed signal for information designed to identify the supplier, as well as a so-called protection bit, indicating that the image is protected and that copying is prohibited.

However, these factors do not offer any security, as it is very easy to delete them because their position is clearly defined in the MPEG standard.

As mentioned previously, this invention concerns a method for marking a compressed digital video signal by incorporating a digital signature embedded in the video signal.

This method is only of use if the signature cannot be deleted easily and if it can withstand handling such as compression and decompression, a zoom function and image shift, this signature obviously not altering the video signal other than imperceptibly.

It is known that the MPEG standard is based on analysis of the development in the various frames and transmission of the differences. It has been noted in particular that, from one image to another, a large amount of the information does not change, or is in a slightly different plane. To this effect the MPEG encoder breaks down the image into blocks, usually of 8 by 8 pixels, with comparisons then being made of these blocks. In particular, these operations concern DCT, DFD and motion vectors. The most recent versions of MPEG enable a set of blocks to be joined and operations then carried out on this set of blocks termed a region.

The DCT (Discrete Cosine Transform) is a transform that yields the amplitudes of the different frequencies forming an image. Its advantage lies in the fact that it is then possible to selectively compress the different frequency bands in order to minimise the visual distortion. It is generally applied to blocks of reduced size (typically 8×8). This transformation is applied to all the blocks forming the image. It is followed by a quantization and entropy coding stage.

The DFD (Displaced Frame Difference) represents the difference between the image predicted by the translational model and the actual image.

DCT coding is applied to transmit DFD type information.

Motion vectors are used when the same block is slightly offset from one frame to the next. In this case, the MPEG format provides an indication that a block present in the preceding image reappears in the current image slightly offset in plane, by a value corresponding to the coefficients X and Y. This principle allows errors, i.e. if a small part of a new block recreated in this way is different, the motion vector will be accompanied by DFD data containing the differences between the displaced preceding block and the actual visual information of this new block.

Previous marking experiments used DCT coefficients to incorporate the signature in a compressed video signal. This technique is, however, highly susceptible to image framing alignment problems, and displacement of one or two pixels results in different DCT coefficients, which prevents extraction of the signature. This disadvantage is exacerbated further still in that certain video processing equipment causes an offset of one pixel in the video signal produced.

SUMMARY OF THE INVENTION

The purpose of this invention is to offer a method for marking a compressed digital video signal by incorporating a signature embedded in the digital signal, which is robust, invisible and identifiable in real time.

To this effect the invention concerns a method for marking a compressed digital video signal by incorporating a digital signature embedded in the compressed video signal, the said signal representing a series of at least two video images, each of the images being divided into several regions, the said signal comprising motion vectors representing motion of the regions between the first and the second image, the method being characterised by the fact that at least one of the coefficients X or Y is modified on at least one of the said motion vectors, with a region possibly consisting of one block.

According to one embodiment, a set of motion vectors MV(i) is selected with a low visual impact, preferably a set of vectors with a norm (magnitude) lower than a threshold R, for example a threshold R equal to 5.

According to an initial variant, a set of vectors contained in a frame preceding a frame containing all the image information (type I frame) is selected.

According to a second variant, the set of motion vectors MV(i) is changed according to a pseudo-random selection modulated by a code, the signature possibly comprising several bits S(i) and at least one of the said coefficients X or Y of the motion vectors possibly being modified according to at least one of the signature bits, with the method possibly also comprising the following steps:

generation of a random number A initialised by a parameter transmitted by the video signal, comprising the same number of bits as S;

modification of one of the coefficients X or Y such that this is signed with bit <<1>> if more than half the bits of A are identical to S or with bit <<0>> if the reverse applies.

The parameter initialising the random number may be a mathematical combination of the norm (magnitude) of the motion vector to be modified.

The coefficients X or Y of the motion vectors can be modified between the second and third frame in inverse proportion to the modification made between the first and second frame.

The advantage of the method as per the invention is that it is highly robust, since only specific knowledge of the original video signal makes it possible to identify the vectors which have been modified. As the image itself has been modified, further processing of this signal will not delete its signature.

The method essentially consists of two stages: selection of regions and modification of motion vectors.

DESCRIPTION OF THE METHOD

The description below is given as an example and refers to the drawing in which FIG. 1 is a diagram showing the steps of the method.

SELECTION OF REGIONS TO BE SIGNED

An initial selection of potential vector candidates for signature is made using the parameters relating to the characteristics of the video.

It has been shown that visual perception easily accommodates changes made in moving regions. In view of this, it is useful to modify the motion vectors as these relate to moving regions. Modifications generated by the coding method are embedded in the image dynamics. Also, by carefully selecting motion vectors, visibility of these modifications can be minimised further still.

It should also be noted that this signature is distributed over the entire image concerned and is not concentrated in one region of the screen as is found with other coding systems. Also, provided that there are sufficient motion vectors, the signature is reproduced in several copies in one frame, and this on each frame containing motion vectors. If the number of vectors is less than the number of bits in the signature, this is distributed over several frames. The result is that even if an extract of the video signal is used the signature can be identified.

It would also be prejudicial to the visual effect to always modify the same vectors in successive frames. Although the number of frames separating two frames containing all the image information is generally less than 10, error accumulations induced by the signature can become visible. To offset these problems, several solutions can be implemented either individually or collectively:

select a frame containing motion vectors which precedes a frame containing the entire image;

select blocks where the modification will be scarcely perceptible, for example those for which the magnitude L of the vector lies between given limits. This magnitude can be calculated as follows: L0 = sup(X, Y) or L1 = |X| + |Y| or L2 = Sqrt(X*X + Y*Y), where Sqrt denotes the square root function and where X and Y are the coefficients of the motion vector;

select a set of vectors differing widely from one frame to another to avoid an accumulated deviation;

offset the modification in the next frame, insofar as there is a vector affecting an identical block.
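
The magnitude-based criterion in the list above can be sketched as follows. This is only an illustration: the function names, the use of absolute values in L0 and the default limits (an upper limit of 5, taken from the example threshold R) are assumptions, not part of the method as described.

```python
import math

def magnitudes(x, y):
    """Return (L0, L1, L2) for a motion vector with coefficients x and y."""
    l0 = max(abs(x), abs(y))        # L0: largest component (absolute values assumed)
    l1 = abs(x) + abs(y)            # L1: sum of absolute components
    l2 = math.sqrt(x * x + y * y)   # L2: Euclidean length
    return l0, l1, l2

def select_low_impact(vectors, lower=0.0, upper=5.0):
    """Keep the vectors whose L2 magnitude lies in [lower, upper)."""
    return [(x, y) for (x, y) in vectors if lower <= magnitudes(x, y)[2] < upper]

candidates = [(3, 4), (1, 1), (0, 6), (-2, 2)]
print(select_low_impact(candidates))
```

Any of the three measures could serve as the selection magnitude; L2 is used here purely for illustration.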

After selecting a set of vectors corresponding to the criteria set out above, the next step is to incorporate the key which will define the vectors to be involved in the coding. This key must not only enable a choice of vectors but also the order of their involvement at the time of coding. To this effect, all the vectors are listed and the order of their interactions with the signature has to be modulated with the key. For this, a number of motion vectors are taken at random from the selection determined previously. The random generator is initialised by the key. This initialisation can be effected by a so-called public key, common to all marked videos, or a so-called private key for unique identification of this video sequence. It is of course possible to use both keys to mark a video sequence.
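
A minimal sketch of a key-dependent selection is given below. It assumes that the key is an integer or a string and uses Python's general-purpose `random.Random` generator as a stand-in for whatever generator an implementation would actually use; the function name `keyed_selection` is hypothetical.

```python
import random

def keyed_selection(num_candidates, key, count):
    """Return `count` distinct candidate indices, in a key-dependent order.

    The generator is initialised by the key, so the same key always yields
    the same choice of vectors and the same order of involvement.
    """
    rng = random.Random(key)              # generator initialised by the key
    return rng.sample(range(num_candidates), count)

# The same key reproduces the same selection; another key generally does not.
first = keyed_selection(100, key=123456, count=10)
again = keyed_selection(100, key=123456, count=10)
print(first == again)
```

Because the returned list is ordered, it fixes both which vectors are marked and the order in which signature bits are assigned to them.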

Modification of Motion Vectors

Modification Method

A simple method consists in modifying the X coefficient or Y coefficient according to the bit to be embedded, by replacing the least significant bit of this coefficient. This coefficient will thus have an even value if the bit is zero or an odd value if the bit is 1.
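
The least-significant-bit replacement can be illustrated as follows, assuming that a coefficient is available as a signed integer (the actual bitstream representation of a coefficient is not considered here). The function names are hypothetical.

```python
def embed_bit(coeff, bit):
    """Replace the least significant bit of an integer coefficient."""
    return (coeff & ~1) | bit

def extract_bit(coeff):
    """Read the embedded bit back: 0 for an even value, 1 for an odd value."""
    return coeff & 1

print(embed_bit(4, 1), embed_bit(5, 0), embed_bit(-3, 0))  # prints: 5 4 -4
```

With this formulation a coefficient changes by at most one unit, and negative coefficients are handled by Python's two's-complement semantics for bitwise operators.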

Direct Signature

According to the method as per the invention, a digital signature is incorporated, for example a word of 32 bits, in a compressed video signal. To this effect, the X or Y coefficients of the selected motion vectors are modified according to the value of the bit of the key. To ensure robust marking, 10 vectors can, for example, be selected representing block movements distributed over the entire surface of the image and modified according to the signature value. This gives us the option of incorporating 20 signature bits in the X and Y coefficients if it is decided to modify both coefficients. If the number of bits exceeds the number of coefficients likely to be marked in a given image, the signature can be distributed over several images. For example, 1 block/image can be marked and 10 images used to embed 10 bits. Once the 32 bits of the signature have been used, the procedure starts again from the first bit, thus ensuring greater coding security, even if the copy is only made on part of the video sequence.
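
As an illustration of the direct signature, the sketch below spreads the signature bits cyclically over the X and Y coefficients of the selected vectors of successive frames. The most-significant-bit-first ordering, the marking of both coefficients and the function names are assumptions made for the example.

```python
def signature_bits(signature, nbits=32):
    """Split an integer signature into a list of bits, most significant first."""
    return [(signature >> (nbits - 1 - i)) & 1 for i in range(nbits)]

def embed_direct(frames, signature, nbits=32):
    """Mark X and Y of each selected vector; restart at bit 0 after nbits."""
    bits = signature_bits(signature, nbits)
    pos = 0
    marked_frames = []
    for vectors in frames:              # one list of selected vectors per frame
        marked = []
        for x, y in vectors:
            x = (x & ~1) | bits[pos % nbits]
            y = (y & ~1) | bits[(pos + 1) % nbits]
            pos += 2
            marked.append((x, y))
        marked_frames.append(marked)
    return marked_frames

def read_direct(frames):
    """Read the embedded bit stream back, in embedding order."""
    return [c & 1 for vectors in frames for vector in vectors for c in vector]

frames = [[(2, 3), (4, 4)], [(1, 0)]]
print(read_direct(embed_direct(frames, 0b101, nbits=3)))  # prints: [1, 0, 1, 1, 0, 1]
```

Reading relies on the bits being visited in the same order as at embedding, which is why this variant depends on synchronisation between signature bits and blocks.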

Probabilistic Signature

Another technique that is much safer overcomes all the problems relating to synchronisation. This technique involves the following steps for incorporation of a signature S:

1. Select a vector according to the selection criteria described above.

2. Generate a random number A comprising the same number of bits as S, using a parameter of the video to initialise the random generator, for example the magnitude of the motion vector.

3. Modify one of the coefficients X or Y to mark this coefficient with bit <<1>> if more than half of the bits of number A are identical to S, i.e. d(S, A) < d(S, NOT(A)), or otherwise with bit <<0>>, where NOT is the Boolean negation function and d( ) is the distance defined between the signature S and the number A as: d(S, A) = Sum(Abs(S(i) − A(i))), Abs( ) being the absolute value function and Sum( ) the sum over all the bits of the signature.

It can be demonstrated that this algorithm enables the signature to be retrieved by statistical convergence.

This technique works irrespective of the spatial and temporal position of the blocks in the video. This means that synchronisation is no longer required between the signature bits and the selected blocks. The only consequence of deletion of parts of the image or of entire frames is a reduction in reliability of the signature retrieved and an increase in the number of frames needed to retrieve the signature.
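
The three steps and the retrieval by statistical convergence can be sketched as follows. Several choices here are assumptions made so that the example is self-contained: the generator is seeded from the Y coefficient and from X with its least significant bit cleared (so that marking does not alter the seed), the mark is written to the least significant bit of X, and retrieval is done by a per-bit majority vote. None of the function names come from the original text.

```python
import random

def pseudo_random_word(x, y, nbits):
    """Regenerate A from data that the marking leaves unchanged (assumed seed)."""
    return random.Random(f"{x & ~1},{y}").getrandbits(nbits)

def distance(s, a):
    """d(S, A): number of bit positions in which S and A differ."""
    return bin(s ^ a).count("1")

def embed_probabilistic(vectors, signature, nbits):
    mask = (1 << nbits) - 1
    marked = []
    for x, y in vectors:
        a = pseudo_random_word(x, y, nbits)
        # Bit 1 if A is closer to S than NOT(A) is, otherwise bit 0.
        bit = 1 if distance(signature, a) < distance(signature, a ^ mask) else 0
        marked.append(((x & ~1) | bit, y))
    return marked

def recover_probabilistic(vectors, nbits):
    mask = (1 << nbits) - 1
    votes = [0] * nbits
    for x, y in vectors:
        a = pseudo_random_word(x, y, nbits)
        guess = a if x & 1 else a ^ mask    # the word that tends to resemble S
        for i in range(nbits):
            votes[i] += 1 if (guess >> i) & 1 else -1
    return sum(1 << i for i in range(nbits) if votes[i] > 0)

vectors = [(x, y) for x in range(-40, 40) for y in range(-25, 25)]
marked = embed_probabilistic(vectors, 0b10110010, nbits=8)
print(bin(recover_probabilistic(marked, nbits=8)))
```

Each marked vector carries only a weak hint about every signature bit, so a vote over many vectors is needed. Dropping vectors lowers the margin of the vote but requires no realignment, which matches the behaviour described above.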

Video data can be any data that are easily accessible in the video signal (value of motion vectors, DC coefficients, etc.).

FIG. 1 shows an example of implementation of the signature method as per the invention. An initial entropy decoding of the video signal (bitstream) is implemented and the motion vectors then extracted. After an initial selection in accordance with the visual criteria detailed above, a second selection dependent on the key is made. The coefficients of the vectors selected are modified by the direct method or the probabilistic method described above. Entropy coding with the new vectors enables the video signal compressed to the initial MPEG format to be obtained.

Signature in the Case of a Mixed Sequence of Inter/intra Coded Images

Clearly, in order to successfully achieve this operation, it is essential to have motion vectors. To this effect MPEG has three types of frames: type I (Intra) frames containing the image information without reference to past images, type B (Interpolated) frames and type P (Predicted) frames, the latter two types of frame containing information expressing the differences in relation to the preceding frame, and possibly containing motion vectors X and Y.

The frames of a sequence are organised in groups of pictures (GOP), each group always starting with a type I frame. A type I frame must be present in each group. In fact, if, for a given video sequence, only type B or P frames are transmitted expressing differences, this would certainly be advantageous in terms of quantity of information transmitted, but would prevent any decoding in the event of the first image not being received.

A group of pictures (GOP) generally consists of eight to ten frames. This number of frames is not dictated by the MPEG standard. However, it does not generally exceed 16 to avoid impairment to transmission reliability.

Example of groups of pictures (GOP):

GOP 1: I P B B P B B P B B
GOP 2: I B B P B B P B B

The signature identification method is based on the presence of motion vectors in the type P or B frames. As stated previously, it is these motion vectors which are modified to incorporate the signature. It may be that, in certain stages such as set-up of a video sequence, the signal only consists of type I frames and renders the signature impossible to detect momentarily, albeit still present in the video signal. On the other hand, available bandwidth requirements for re-encoding of such a sequence for distribution on a recorded medium (CD) or broadcast (TV) will necessarily generate reappearance of the differential coding.

It has already been shown that the number of frames in a group of pictures (GOP) may vary depending on the compression module used. In view of this, if a compressed video signal marked in accordance with the method as per the invention is decompressed and then re-compressed with a different number of frames per group, certain frames will have completely different vectors from those present on marking.

Signal for marking with 8 frames per group:

Frame:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 . . .
Type:   I  P  B  P  P  B  P  P  I  P  P  B  B  P  P  B  I  B  P  P

Signal for re-compression with 10 frames per group:

Frame:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 . . .
Type:   I  P  B  P  P  B  P  P  P  B  I  P  B  P  P  B  P  P  B  B

It will be noted that the vectors encoded in frame 11 have disappeared as the new frame 11 is type I. This is a temporary phenomenon, however, as frame 12 will contain difference information in relation to the preceding frame and the motion vectors will thus reappear.
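
The comparison of the two sequences above can be reproduced with a few lines of code. The sketch simply lists the frames that carried motion vectors (type P or B) at marking time and become type I after re-compression; the function name is hypothetical.

```python
MARKED = "I P B P P B P P I P P B B P P B I B P P".split()        # 8 frames per group
RECOMPRESSED = "I P B P P B P P P B I P B P P B P P B B".split()  # 10 frames per group

def frames_losing_vectors(before, after):
    """Frame numbers (1-based) that were type P or B and are now type I."""
    return [n for n, (b, a) in enumerate(zip(before, after), start=1)
            if b in ("P", "B") and a == "I"]

print(frames_losing_vectors(MARKED, RECOMPRESSED))  # prints: [11]
```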

The method as per the invention further offers the great advantage of enabling authentication in real time of a video signed by the method. Identification is simple as it makes direct use of the motion vectors contained in the signal.

If, contrary to expectations, the signal to be authenticated no longer contains any motion vectors, it is still possible to decompress it to obtain a conventional video signal, and then re-compress it so as to make the motion vectors reappear. Obviously in this case the possibility of identifying a signature in real time is lost.

Other Advantages of the Method

The signature or the reading of the motion vectors requires negligible computing capability compared to the coding operation or even decoding. This is due to the absence of mathematical transforms such as may be required with other methods (DCT, motion estimation, etc.). This has been established experimentally with an MPEG-4 encoder. Its incorporation within coding systems thus has minimal impact on the cost price of the product. Comparison can be made with a system that cannot work directly in the compressed domain. A technique of this type would require a supercomputer, such as a Cray, to mark a video with an insufficient throughput of 5 frames/second.

The signature only entails minimal modification to the size of the compressed video. In the case of DCT-based systems, considerably more complex procedures are involved to limit the increase in digital throughput relating to incorporation of the signature. The method as per the invention can thus be incorporated in a video transmission line such that it has no significant impact on the transmission time or rate.

Motion vectors intrinsically relate to movement in the video itself. Consequently, different alignment of the image, such as lateral displacement of two pixels, cannot destroy the signature. This has been tested with shifts of a few pixels and rotation of 2 to 3 degrees.

The method is exceptionally robust under compression. This is due to the fact that compression of video images is done essentially by quantization of the DFD. If this quantization reduces the energy of the DFD, the reliability of extraction of the signature is further reinforced. (Tests show robustness down to a transmission rate of 1 Mbit/s, in CCIR601 10 frames/s).

What is claimed is:
 1. Method for marking a compressed digital video signal including a bitstream part encoding a P-type or B-type frame by incorporating a digital signature embedded in the compressed video signal, without computing motion estimation and without computing motion compensation, the said signal representing a series of at least two video images, each of the images being divided into a plurality of regions, the said signal comprising motion vectors representing the movement of the regions between the first and second image, the method comprising the steps of: Selecting from the signal of the bitstream part encoding a P-type or B-type frame; Entropy decoding of the bitstream; Extracting the motion vectors; Selecting a set of the motion vectors according to visual criterion; Selecting a subset of said selected set of motion vectors according to a key; Modifying at least one of the coefficients X or Y on at least one of the motion vectors of said selected set of motion vectors; Inserting the modified motion vectors in the bitstream; Entropy encoding of the bitstream; and Inserting the bitstream in the video signal.
 2. Method as per claim 1, characterised in that a set of motion vectors MV(i) is selected with low visual impact.
 3. Method as per claim 2, characterised in that a set of vectors is selected with the magnitude below a threshold R.
 4. Method as per claim 2, characterised in that the said threshold R is equal to 5.
 5. Method as per claim 2, characterised in that a set of vectors is selected from a frame preceding a frame containing all the image information (type I frame).
 6. Method as per claim 2, characterised in that the set of motion vectors MV(i) is modified according to a pseudo-random selection modulated by a code.
 7. Method as per claim 6, characterised in that the signature comprises a plurality of bits S(i) and in that at least one of the said coefficients X or Y of the motion vectors is modified according to at least one of the signature bits.
 8. Method as per claim 7, characterised in that it further includes the following steps: generation of a random number A initialised by a parameter issued from the video signal, comprising the same number of bits as S; modification of one of the coefficients X or Y such that this is signed with bit <<1>> if more than half the A bits are identical to S or with bit <<0>> if the reverse applies.
 9. Method as per claim 8, characterised in that the parameter initialising the random number is a mathematical combination of the magnitude of the motion vector to be modified.
 10. Method as per claim 1, characterised in that a region consists of one block.
 11. Method as per claim 1, characterised in that the coefficients X or Y of the motion vectors are modified between the second and third frame in inverse proportion to the modification made between the first and second frame.
 12. Method as per claim 2, characterised in that the coefficients X or Y of the motion vectors are modified between the second and third frame in inverse proportion to the modification made between the first and second frame.
 13. Method as per claim 3, characterised in that the coefficients X or Y of the motion vectors are modified between the second and third frame in inverse proportion to the modification made between the first and second frame.
 14. Method as per claim 4, characterised in that the coefficients X or Y of the motion vectors are modified between the second and third frame in inverse proportion to the modification made between the first and second frame.
 15. Method as per claim 5, characterised in that the coefficients X or Y of the motion vectors are modified between the second and third frame in inverse proportion to the modification made between the first and second frame.
 16. Method as per claim 6, characterised in that the coefficients X or Y of the motion vectors are modified between the second and third frame in inverse proportion to the modification made between the first and second frame.
 17. Method as per claim 7, characterised in that the coefficients X or Y of the motion vectors are modified between the second and third frame in inverse proportion to the modification made between the first and second frame.
 18. Method as per claim 8, characterised in that the coefficients X or Y of the motion vectors are modified between the second and third frame in inverse proportion to the modification made between the first and second frame.
 19. Method as per claim 9, characterised in that the coefficients X or Y of the motion vectors are modified between the second and third frame in inverse proportion to the modification made between the first and second frame.
 20. Method as per claim 10, characterised in that the coefficients X or Y of the motion vectors are modified between the second and third frame in inverse proportion to the modification made between the first and second frame.
 21. Method as per claim 1, wherein the compressed digital video signal is obtained by an MPEG-like technique. 